# RLHF reward model
RM Mistral 7B
A reward model fine-tuned from Mistral-7B that scores response quality for Reinforcement Learning from Human Feedback (RLHF).
Large Language Model
Transformers

weqweasdas
552
22
RM Gemma 2B
A reward model fine-tuned from google/gemma-2b-it that scores the quality of generated text.
Large Language Model
Transformers

weqweasdas
2,618
25
GPT2 Large Helpful Reward Model
MIT
A GPT-2 Large model trained on the Anthropic/hh-rlhf helpfulness dataset, designed for detecting helpful responses or for use as a reward model in Reinforcement Learning from Human Feedback (RLHF).
Large Language Model
Transformers

Ray2333
2,935
11
Prometheus 13b V1.0
Apache-2.0
Prometheus is an evaluation-focused language model fine-tuned from Llama-2-Chat, excelling at assessing text quality against custom criteria, serving as a cost-effective alternative to GPT-4 evaluation.
Large Language Model
Transformers, English

prometheus-eval
1,726
139